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Abstract —This paper addresses the development of ana¬ 
lytical tools for the computation of the moments of random 
Gram matrices with one side correlation. Such a question 
is mainly driven by applications in signal processing and 
wireless communications wherein such matrices naturally 
arise. In particular, we derive closed-form expressions for 
the inverse moments and show that the obtained results 
can help approximate several performance metrics such 
as the average estimation error corresponding to the 
Best Linear Unbiased Estimator (BLUE) and the Linear 
Minimum Mean Square Error LMMSE or also other 
loss functions used to measure the accuracy of covariance 
matrix estimates. 

Index Terms —Gram matrices, One side correaltion, 
Inverse moments, Linear estimation, BLUE, LMMSE, 
Sample covariance matrix. 

I. Introduction and basic assumptions 

The study of the behavior of random matrices is 
a key question that appears in many disciplines such 
as wireless communication, signal processing and eco¬ 
nomics, to name a few. The main motivation behind 
this question comes from the fundamental role that 
play random matrices in modeling unknown and unpre¬ 
dictable physical quantities. In many situations, mean¬ 
ingful metrics expressed as scalar functionals of these 
random matrices naturally arise. The understanding of 
their behaviour is, however, a difficult task which might 
be out of reach especially when involved random models 
are considered. One approach to tackle this problem is 
represented by the moment method. It basically resorts 
to the Taylor expansion of differentiable functionals in 
order to turn this difficult question into that of computing 
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the moments of random matrices, where the moment 
of a m x m random matrix S refers to the quantities 
ATrE(S r ) for r € Z. Along this line, a large amount of 
works, mainly driven by the recent advances in spectral 
analysis of large dimensional random matrices, have 
considered the computation of the asymptotic moments, 
the term ’’asymptotic” referring to the regime in which 
the dimensions of the underlying random matrix grow 
simultaneously large. Among the existing works in this 
direction, we can cite for instance, the work in HH 
where the computation of the asymptotic moments is 
used to infer the transmit power of multiple signal 
sources, that of |j3} dealing with the asymptotic moments 
of random Vandermonde matrices and finally that of 
Pfll . where the authors studied the asymptotic behavior 
of the moments in order to allow for the design of a 
low complexity receiver with a comparable performance 
to the lineal - minimum mean square error (LMMSE) 
detector. While working under the asymptotic regime has 
enabled the derivation of closed-form expressions for all 
kind of moments, it presents the drawback of being less 
accurate for finite dimensions. Alternatively, one might 
consider the exact approach, which relies on the already 
available expression of the marginal eigenvalues’ density 
of Gram random matrices. Interestingly, this approach, 
despite its seemingly simplicity, has mainly been limited 
to computing the moments of Wishart random matrices 
115161 . To the best of the authors’ knowledge, the case 
of random Gram matrices has never been thoroughly 
investigated. This lies behind the principal motivation 
of the present work. 

In this paper, we consider the derivation of the exact 
moments of random matrices of the form S = H*AH, 
where H is a n x m (n > m) matrix with indepen¬ 
dent and identically distributed (i.i.d.) zero-mean unit 
variance complex Gaussian random entries, and A is 
a fixed n x n positive definite matrix. It is worth 
pointing out that matrix S cannot be classified as a 
Wishart random matrix. However, its positive moments 
-fTrES r , r > 0 coincide with those of the Wishart 


2 


random matrix A 2 HEUA 2 , and can be thus computed 
by using existing results on the moments of Wishart 
matrices. As far as inverse moments arc considered 
(r < 0), the same artifice is of no help, mainly because 
the random matrix A^HHAs becomes singular and 
thus inverse moments cannot be defined. Besides, from 
a theoretical standpoint, computing the inverse moments 
using the Mellin transform derived in Q is not an easy 
task. The crude use of the expression provided in (7| 
brings about singularity issues, as will be demonstrated 
in the course of the paper. Answering to the so-far 
unsolved question of computing the inverse moments of 
Gram random matrices constitutes the main contribution 
of this work. Additionally, based on the obtained closed- 
form expression of the exact moments, we revisit some 
problems in linear estimation. In particular, we provide 
closed-form expressions of the mean square error for 
the best linear unbiased estimator (BLUE) and the linear 
minimum mean square error estimator (LMMSE) in both 
high and low SNR regimes. We also derive as a further 
application the optimal tuning of the windowing factor 
used in covariance matrix estimation. 

The remainder of this paper is organized as follows. 
In Section II, we present the main result of this paper 
giving closed form expressions of the inverse moments. 
In section III, we provide some potential applications and 
discuss some performance metrics. We then conclude the 
paper in section IV. Mathematical details are provided in 
the appendices. 

Notations: Throughout this paper, we use the following 
notations : E (X) stands for the expectation of a random 
quantity X and Ex (/) stands for the expected value 
of / with respect to X. Matrices are denoted by bold 
capital letters, rows and columns of the matrices are 
referred with lower case bold letters (I n is the size-n 
identity matrix). If A is a given matrix, A f and A* 
stand respectively for its transpose and transconjugate. 
For a square matrix A, we respectively denote by Tr (A), 
det (A) and ||A|| its trace, determinant and spectral 
norm. We refer by [A] - ■ the (i,j )th entry of A and by 
diag (ai, a 2 , • • • , a n ) the diagonal matrix with diagonal 
elements, a\, ct 2 , ■ ■ ■ , a n . 

II. Exact Closed-form expression for the 

MOMENTS 

Consider a (n x m) random matrix H with i.i.d zero- 
mean unit variance complex Gaussian random entries 
with m < n. Let A be a deterministic (n x n) positive 
definite matrix with distinct eigenvalues (0\ < • • • < 6 n ) 
and define the Gram matrix S as: 


In this paper, we consider the computation of the mo¬ 
ments fi\ (r) defined as: 

AOv (r) = —Tr (E H {S r }), reZ. (2) 

m 

As we will see later, in contrast to the positive moments 
(r > 0) which can be directly obtained from the marginal 
eigenvalues’ density, the derivation of the inverse mo¬ 
ments (r < 0) is not immediate, requiring a careful 
analysis of the available existing results. 

In the following, we will build on the exact approach 
in order to derive closed-form expressions for the mo¬ 
ments n\ (r). The asymptotic moments will be dealt 
with subsequently in order to illustrate their inefficiency 
in evaluating the moments of the eigenvalues of small 
dimensions Gram matrices. 


A. Closed-form Expressions for the Exact Moments in 
Fixed Dimensions 

The exact calculation of moments is mainly based on 
existing results on the marginal density of the eigenval¬ 
ues of S. These results characterize the Mellin transform 
of the marginal density, the definition of which is given 

by: 

Definition 1. Denote by £ 1 —>• f\ (f ) the marginal density 
distribution of an unordered eigenvalue of S. Then, the 
Mellin transform of j\ (.) is defined as 

noo 

M h (s)± Z'-'fxiQdt. ( 3 ) 

J 0 

With the above definition at hand, we are now in 
position to recall the following Lemma that provides a 
closed-form expression for the Mellin Transform of the 
marginal density of S: 

Lemma 1. /[7j Theorem 2] Let S be as in ([TJ- Then, 

m m 

= T(a+j-l) 

j =1 *=1 

n—m n—m 

-EE [*- 1 ] w er” + * +f - 

1=1 k=l 

(4) 

with L = r( ' } the Gamma fac¬ 

tion and is the (n — m) x (n — rn) Vandermonde 
matrix, 


nn—m+s+j —2 
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s = H*AH. 
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where T>(i,j) is the (i, j)—cofactor of the (m x m) 
matrix C whose ( l,k)—th entry is given by 


n—m n—m 


<* -1)! cztt 1 -EE [*■+,, 

\ p =1 q =1 


v QP~ t an—m+k —1 | 

X a n-m+l a q 


The exact moments p\ ( r ) for r > 0 can be obtained 
as a direct consequence of Lemma Q] by replacing in © 
s by r + 1, thereby yielding the following corollary: 

Corollary 1. For r > 0, the moments fa (r) are given 
by: 

FA (r) = M fx (r + 1) 
where Mf x (r + 1) is given by ©. 

In sharp contrast to the case of positive moments 
(r > 0), the inverse moments can not be obtained 
by a crude substitution of s by — r — 1 for r > 0. 
The problem essentially stems from the terms in the 
sum wherein the Gamma function is applied to negative 
integers on which it is not defined. This might give 
the impression that the inverse moments arc infinite and 
cannot be thus computed. Such a quick conclusion goes, 
however, against the existing results on inverse moments 
available for wishart matrices, thus leading us to suspect 
the effect of the Gamma function to be cancelled out 
in one way or another. In order to study the expected 
compensation effect, it is natural to analyze the behavior 
of M.f x (s — r + 1) for small values of s. If a limit exists 
as s goes to zero, one might expect it to coincide with the 
sought-for value of the r-th moment. Such an intuition 
is confirmed by theory under some conditions on r as it 
can be shown from the following Lemma. 

Lemma 2. If r >n — m, then the limit lim^o Mf x (s — 
r + 1). exists and 


Fa{—p) and ^ J 0 °° x~ r p(x)dx From Theorem 5.4 in 
l8l . we know that x~ r p(x) is integrable provided that 
r < n — m. Therefore, for r < n — m, fa (—r) is finite 
and satisfies: 

FA (—r) = lim M/ X (s — r + 1). 


From Lemma 0 we can see that the computation of 
the moments fa(~ r) amounts to evaluating the limit of 
M f x (s—r+1) as s goes to zero. A careful scrutiny of the 
expression of Mf x (s — r + 1) reveals that the sum over 
index j makes appear two types of terms. The first type 
corresponds to those indices of j for which —r + j — 1 
is positive. The limits of these terms can be computed 
normally by setting s to 0 since T(—r+j — 1) is properly 
defined. The second type of terms is more difficult to 
analyze, since it corresponds to those indices of j for 
which —r+j — 1 is negative. Obviously, these two types 
of terms cannot be handled similarly. In light of this 
observation, it is sensible to decompose Mf x (s — r + 1) 
as the sum of two quantities depending on the value of j, 
whether it is below or above r + 1. This decomposition 
writes as: 


Mf x (s - t + 1) = M\ (s - r + 1) + M 2 (s-r + 1), 

( 6 ) 

where 


j =1 *=1 


qn—m+s+j —2 
7 n— m-\-i 


n—m n—m 


E E i* % 

1=1 k=L 
m m 


1 ] a n-m+s+j-2pk-l 


'l 


J n—m+i 


M 2 (s) = L Y. E p ( i ’j) r ( s + -?'- 1 ) 

j=r -\-1 i =1 


qn—m+s+j —2 
7 n— m+i 


n—m n—m 


E E [*++r" + ' +J_ 2 <fc+, 


1=1 k=1 


FA(—r) = lim Mt x (s — r + 1). 
s4.o 

Proof: Let x i->- p(x) be the probability density 
function corresponding to the smallest eigenvalue of 
HH*. Then, obviously, 

i r°° 

Mf x (s — r + 1) < — / x s r p(x)dx. (5) 

VI Jo 

It ensues from the Monotone convergence theorem ap¬ 
plied to the sequence of functions (x i-> x s ~ r p(x)) s > o 
and (x^x s r f(x)) s>0 that if limits for the both hand 
sides of © exist, they must be equal respectively to 


We will first handle the second term M 2 (s — r + 1), 
gathering indices j for which —r + j — 1 is positive. 
Interestingly, we can prove that its limit is zero as s | 0, 
which shows that it does not contribute in the expression 
of the final moment. 

Proposition 1. The term M 2 (s — r + 1) vanishes as s 
goes to zero i.e, 

lim M 2 (s — r + 1) = 0, r = 1, ■ ■ ■ ,m. 

s—»o 

Proof: See Appendix A for the proof. ■ 

The expression of the moment is thus totally ruled 
out by the contribution of the first term M\ (s — r + 1). 
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Before providing the expression of its limit as s j, 0, we 
shall introduce the following notations: 


a, = 


nn—m—r+j —1 rsn—m—r+j — 1 

'l 5 “2 5 


D ?: = diag |tog 


hn—m+i 


.log 


Qin—m—r+j —1 
5 ” n—m 


'n—m+i 


11 


02 


-■ lo 8 fe)) 

u. A h r\ nn—m— 11* 

u-j — [i, t7 n -m+j, ‘ ‘ ‘ ,u n _ m+ jj ■ 


With these notations at hand, we arc now in position to 
state the following result: 

Proposition 2. Let p = min (m, n — rn), then for 1 < 
r < p we have 


lim Mi (s — r + 1 ) 


= L EE V ^^ 

3=1 i= 1 


{r- j)\ 


b> 1 T),a ( . 


Proof: See Appendix B for a detailed proof. ■ 
Combining the findings of the above propositions, we 
finally obtain the following result: 

Theorem 1. For 1 < r < p, we have 


B. Asymptotic Inverse Moments 

It is well-known from standard results on random 
matrix theory that moments of Gram random matrices, 
can be well-approximated, when properly scaled and for 
m and n large enough, by deterministic quantities. How¬ 
ever, the derivation of these deterministic approximations 
differs from the exact approach in several respects, 
namely it does not rely on the same tool of the Mellin 
transform and does not necessarily yield simple closed- 
form expressions for any high order moment. As a matter 
of fact, it is shown in 0] that except the special case of 
A coinciding with the identity matrix, the computation 
of higher positive order moments has to be performed 
iteratively. This also holds for the case of asymptotic 
inverse moments, which can be derived using the same 
approach as in @. This represents the main goal of 
this section, for which details will be provided for sake 
of completeness. The obtained asymptotic moments will 
be compared later with the exact ones derived in the 
previous section. 

Prior to stating the main algorithm leading to the 
asymptotic inverse moments, we shall first review the 
following results from random matrix theory. 

Definition 2. (Empirical Spectral Distribution) Let 
A £ C mxm be a Hermitian matrix with eigenvalues 
Ai, A 2 , • ■ ■ , A m- The empirical spectral distribution F A 
of A is defined as 


MA (-r) 


r m 

L EE v ^n 

3=1 i= 1 






III 

F A (x) = 1 (A < X) (9) 

i=l 


Remark 1. Without loss of generality, we can easily 
show that matrix A can be considered as diagonal with 
diagonal elements 0 1 , • • • . (9/y. This can be seen from the 
eigendecomposition of A as follows 

A = U*DU, (7) 

where D = diag (0\ , 62, ■ ■ ■ ,0 n ) and U is a unitary 
matrix, i.e. U*U = UU* = I n . Then, 

S = H*U*DUH 

= (UH)’D(UH) 

= G*DG, 


Working directly on the empirical distribution function 
F a is in general a tedious task. Instead, a character¬ 
ization of its Stieltjes transform is often considered. 
The Stieltjes transform corresponding to the empirical 
distribution F A is defined as: 


Definition 3. (Stieltjes Transform) For a hermitian ma¬ 
trix A, the Stieltjes transform is defined as 


rhA (z) = I -^—dF A (A) 

= — 7r(A - zlm)- 1 
m 

From definition [3] it is easy to prove 


( 10 ) 


where G = UH. Since the wishart distribution is 
unitarily invariant, G has the same distribution as H. 
Therefore, 

PA (r) = - Tr (E c {(G*DG) r }), r £ Z (8) 
m 


( = —Tr (A-<‘+‘> 

V dz ) 2=0 m V 


( 11 ) 


Let D be the diagonal matrix as defined in ©. Then, 
the Stieltjes Transform (ST) of the empirical measure 
of ^-S converges to a deterministic measure whose ST 
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m ( 2 ) is the solution of the following fixed point equation 
191: 

1 


m (z) = 


-* + £ ELt 


[D1 


( 12 ) 


1 +[D], fc m(z) 


Denote by m( r ' ) the r-th derivative of m(z) at z = 0. 
Along the same arguments as in 0|, we can prove that 
is a consistent estimate of r!m r Tr (H*AH)~^' +1) . 
This suggests in particular estimating the scaled inverse 
moments ^-Tr((^-S) ') by 3, mS r \ Closed-form expres¬ 
sions for the derivatives of m/ r -' do not exist, but they 
can be numerically computed recursively using the result 
of the following Theorem. 



Theorem 2. Let p > 1 and f k (z ) = — , . r^rr 1 —rr- 

y J 1+[D } k<k m(z) 

Denote by the p-th derivative of fk(z) at z = 0. 
Then, the following relations hold true: 


Figure 1. Inverse moments for A defined as in {H: A compari¬ 
son between theoretical result (Theorem 1), normalized asymptotic 
moments (Theroem 2) and Montecarlo simulations (10 4 realizations). 


pm 


(P-1) 


[DW) 


^! + [D ] k , k 


( 0 ) 


m 


m (0) 


+ 


n p -1 

USE 

k= 1 1=1 


[ D ]fc,fc m 




l + [D] M m(0) 


= 0, 


(13) 


Montecarlo simulations. 

We start by verifying the result in Figure 1, in the case 
where A is a correlation matrix having the following 
structure 



(15) 


,(p) ^ ( P \ [D] fc , fc m (i) /r° 

4 1 + \P) kik m(0) 1 + [D] fe>fc m(0) ■ 

(14) 

Proof: See Appendix C for detailed proof. ■ 
Based on the previous theorem, an algorithm can be 
provided in order to recursively compute the higher order 
derivatives, rrS k \ These values will thus immediately 
serve to compute the deterministic approximations for 
the moments. 

Algorithm 1 Asymptotic inverse moments computation 
1: Compute m(0) using (fl2l) 

2: Compute f k (0) = ~ 1+[D] ^ fcm(0) 

3: for i = 1 — > p do 
4: compute mPl using (fl3l) 

5: compute using (fl4l) 

6: end for 


where Jo (.) is the zero-order Bessel function of the first 
kind. This kind of matrices is used to model the corre- 
. lation between transmit antennas in a dense scattering 
environment. For simulations, we set m = 3 and vary n 
such that n > m. 

In Figure 2, we compare the same quantities in the case 
where A is a random positive definite matrix generated 
as follow^ 

A = I n + W*W, (16) 

where W is a (n x n) matrix with i.i.d zero-mean unit 
variance complex Gaussian random entries. 

In both Figures 1 and 2, the theoretical result of Theorem 

1 perfectly matches the montecarlo simulations, however 
the normalized asymptotic moments derived in Theorem 

2 only matches the exact results for the case where 
r = —1, —2 where the model in (fl5l) is adopted. This 
can be explained by the fact that the use of normalized 
asymptotic moments lead to inaccurate results in the 
regime of fixed m and n and that the result of Theorem 
1 is more suitable in this case. 


C. Numerical Examples 

We validate the theoretical result stated in Theorem 
1 for different values of m and n. In particular we 
compare p\ (—rj with the normalized asymptotic mo- 
ments T ,^ r+1 and the empirical moments obtained by 


III. Applications of The Inverse moments 

The computation of inverse moments of one-sided 
correlated Gram matrices is paramount to many appli¬ 
cations of signal processing. For sake of illustration, 

'We use such matrices to make sure that the obtained results are 
applicable for broad class of positive definite matrices. 
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Figure 2. Inverse moments for A defined as in {Hi: A compari¬ 
son between theoretical result (Theorem 1), normalized asymptotic 
moments (Theroem 2) and Montecarlo simulations (10 4 realizations). 

we will discuss in the sequel applications of our results 
to the fields of linear estimation and covariance matrix 
estimation. 

A. Linear Estimation 

The problem of estimating an unknown signal from 
a sequence of observations has been widely studied 
in the literature iflOl — lTT2l and can be solved if joint 
statistics or cross correlations of the unknown signal and 
the observations vector arc available. In this line, linear 
models are a special case where, the input and the output 
are linearly related as 

y = Hx + z, (17) 

where y € C nxl is the observations vector, H G C nxm 
the channel matrix, x G C mxl the unknown signal 
with covariance matrix S x and z G C nxl the noise 
vector with covariance matrix S 2 . As stated earlier, in 
order to recover x, joint statistics are required. However, 
acquiring joint statistics is generally a difficult task 
either because of the unknown nature of the signal or 
simply because of the unavailability of the statistics. To 
overcome this issue, linear estimators can be viewed as 
a good alternative. They are merely based on applying 
a linear transformation to the observation vector. Obvi¬ 
ously, this is a sub-optimal strategy in regards of the 
minimization of the mean square error, but it is more 
tractable and permits to explicitly analyze performances. 
In Table I, we review the explicit expressions of the 
unknown signal for different estimation techniques de¬ 
pending on the informations available about the signal 
and the noise statistics. In what follows, we make the 
following assumptions: 


. H is a (n x m) matrix with i.i.d complex zero mean 
unit variance Gaussian random entries 
• z is a (n x 1) zero mean additive Gaussian noise 
with covariance matrix "E z = E{zz*}, i.e. 
z ~ CJ\f (0 n , S 2 ). 

1) An Exact expression for The BLUE Average Esti¬ 
mation Error: Let n > m and consider the same linear 
system as in (fTTT ). With the the noise covariance matrix 

at hand, the best linear unbiased estimator (BLUE) 
urn recovers x as: 

x««e = (H*S7 1 H)" 1 H*I]7 1 y 

= x+(H*E- 1 H) _1 H*£- 1 z (18) 

X + &bl ue - 1 

where euue = (EPS^El) 1 H*Ej 1 z is the resid¬ 
ual error after applying the BLUE. We denote by 
£ e,blue = } the covariance matrix of e hlue , 

then ^e,blue = (EPXl^H) 1 , Using the result of 
Theorem 1, the average estimation error is thus given 
by: 

Ewrllxwne - x l| 2 } = EliTr (£ e ,Wue) 

= E H Tr((H*E; 1 H)" 1 ) (19) 

= rnp\ (- 1 ), 

where A = Er 1 . 

For simulation purposes, we set m = 3, and consider 
A as in (fl5T) and (fl6l) . Then, we compare the empirical 
average estimation error using Montecarlo simulaton for 
different values of n with the theoretical result derived 
in Theorem 1. As shown in Figure 3, the theoretical 
performance exactly matches the exact performance of 
the BLUE in terms of average estimation error for both 
models of A in (fl5l) and (fl6l) . 

2 ) Approximation of the LMMSE average estimation 
error: Consider the linear system as in (fT71) . where we 
assume additionally that the covariance matrix of the 
unknown signal x is known and given by £ x . The linear 
minimum mean square error estimate (LMMSE) of x is 
thus given by: 

timmse = (^f l + H*E^H) _1 iTE^y (20) 

Consequently, the estimation error can be calculated as 

eimmse = (S; 1 + H*S^H) - 1 H*E“ 1 z (21) 

By standard computations and based on the result derived 
in CD, the error covariance matrix is given by 

£e,I mmse = + EUE^H) _1 (22) 












Table I 

Linear Estimation Techniques depending on the available Statistics 
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Linear estima¬ 
tor 

Required 

Statistics 

Estimated signal, x 

Error covariance matrix, 
E(x — x) (x — x)* 

LS 

0 

(H’H) -1 H*y 

(ELH)” 1 

BLUE 


(H , s; 1 H)"‘H*E7 1 y 

(H’ENh) -1 

LMMSE 

See 

(Ef 1 + LLEiNH) -1 H*S“ 1 y 

I^+itsFh y 1 
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■ - - Theoretical approximation (Low SNR) 
--Theoretical approximation (High SNR) 
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Figure 3. BLUE average estimation error for m = 3: Montecarlo 
simulation (10 4 realizations) versus theory (Theorem 1) 


Figure 4. LMMSE mean square error with as in G5): Montecarlo 
simulation versus theoretical approximation for the low and high SNR 
regimes. 


Therefore, the average estimation error for the LMMSE 
estimator is given by 

||xf mm ,g e x|| } = IEhTV (5j ejmmse) 

= E H Tr ((E- 1 + H*£- 1 H) _r ) 

(23) 

Evaluating the LMMSE average estimation error is in 
general very difficult, however, for the simple case where 
£ x = a 2 I n , it is possible to obtain an approximation 
depending on the value of < 7 ^. In the following theorem, 
we provide tight approximations for the LMMSE average 
estimation error for the cases: a 2 S> 1 (High SNR 
regime) and a 2 -C 1 (Low SNR regime). 

Theorem 3. Let A = Sj 1 . Then, the LMMSE average 
estimation error at both the high SNR regime (a 2 3> 1) 
and the low SNR regime (rr 2 1 ) is given by 

1) High SNR regime: 
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—Theoretical approximation (Low SNR) 
-Theoretical approximation (High SNR) 
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Figure 5. LMMSE mean square error with Y, z as in filth Montecarlo 
simulation versus theoretical approximation for the low and high SNR 
regimes. 


2) Low SNR regime: 


Eh{||x^ 


-x|| 2 } 


= D + ofc 2 '-) <24) 

k =0 ° x 


EH{||x immse - x|| 2 } = {-l) k a 2k+2 p\ (k) 

k =0 


( 25 ) 


where l < p — 1 with p = min(m, n — m). 


Proof: See Appendix D for Proof. 


































To validate the approximations derived in Theroem 
3, we set m = 3 and n = 10 and apply the obtained 
results for both correlation models in ( fT5l ) and ([T6l ). For 
that in Figures 4 and 5, we compare the mean square 
error of the LMMSE using Montecarlo simulations with 
the approximations derived in Theorem 3. As shown in 
Figures 4 and 5, the approximation is quiet tight in both 
high and low SNR regimes and almost cover the whole 
SNR range. 


B. Sample Correlation Matrix (SCM) 

Estimation of covariance matrices is of fundamental 
importance to several adpative processing applications. 
Assume that the measurements are arranged into an input 
vector u E C mxl called also the observation vector. If 
the input process is zero-mean, its covariance matrix is 
given by: 

R = E{uu*}, (26) 


where the expectation is taken over all realization of the 
input. The covariance matrix R is usually unknown, and 
thus has to be estimated. Assuming the input process to 
be ergodic, the covariance matrix can be estimated via 
time averaging. A well-known estimator is the sample 
correlation matrix (SCM) which is given by lfT3l 

1 n 

R (n) = - ^2 u (&) u * (k ), (27) 

k =1 


This is called rectangularly windowed SCM, where u (k) 
is the input vector at discrete time k and n is the 
length of the observation window. When the observations 
are Gaussian distributed, the SCM is the maximum 
likelihood (ML) estimator of the correlation matrix II141 . 
Moreover, for a fixed and finite input size m, as the 
window size n —>• oo, the SCM converges to the input 
correlation matrix fl5ll , in the sense that: 


R — R (n) 


0 , 


a.s. 


(28) 


where ||.|| is a spectral norm of a matrix. 

Flowever, the number of measurement is usually finite for 
practical applications. Thus, it is for a practical interest to 
evaluate the performance of the SCM when the window 
size is finite. In order to measure the accuracy of the 
estimator, we define the average loss as the average 
distance between the input correlation matrix and its 
estimated version using SCM for a given window size n 

IH: 


Loss (n) = E R 2 R 1 (ra) R^ — I r 


(29) 


where R 2 is a positive semi-definite square root of R 
and ||.|| F is the Forbenius norm of a matrix. 


In order to emphasize some measurements relevant for 
the estimation of the correlation matrix, an exponentially 
weighted SCM can be used and it is given by Ifl3l : 

n 

R(n) = (1 - X)^2\ n ~ k u{k)u* {k), (30) 

k =1 

where \ n ~ k is the weight associated to the measurement 
vector at time instant k, the coefficient A being the 
forgetting factor. In the case where u (k) is modeled as a 
colored process u (k) = R^x (k), where x (k) E C mxl 
is a vector of i.i.d Gaussian zero mean, unit variance 
entries, the SCM can be written in a matrix form as 

R(n) = R^XA(n)X*R3, (31) 


where Xismxn matrix whose Ath column is x (/;•) 
and A (n) = (1 — A) diag (A n_1 , A n ~ 2 , ■ ■ ■ , l). In the 
following, we prove that we can derive a closed-form 
expression for the loss function defined in (1291) . Let 
S n = XA (n) X*. Then, the loss can be expressed as 


Loss (n) = E ||S„ 1 — I 1 

[s - 1 - 

= ETr S~ 2 — 2S~ 1 +1 


m 11 p 

= ETr [S” 1 — I m ] * [S” 1 — I, 


= m + ETr [S" 2 ] - 2ETr [S” 1 ] 

= m + mp A ( n) (-2) - 2 mp A(n) (-1) 
= m(l + ^A(n) (-2) - 2 p Mn) (-1)) . 


(32) 


One interesting problem is to find the optimal A E 
(0,1) denoted by A* that minimizes the loss function. 
This can be performed by evaluating the loss function 
with respect to A E (0,1) and then picking the A that 
gives the lowest loss. To evaluate the loss function, 
one can resort to MonteCarlo simulations. This would 
involve high complexity since they should be repeated 
for each value of A. The use of the provided closed-form 
expression represent thus a valuable alternative being at 
the same time accurate and easier to implement. For 
m = 3 and n = 10, we plot the estimation loss as a 
function of A using the theoretical expression derived in 
(l32l) . As shown in Figure 1, for all cases a minimum exist 
and thus the performance can be optimized accordingly. 


IV. Conclusion 

In this paper, we derived closed form expressions for 
the inverse order moments of general Gram matrices with 
one side correlation. Based on this formula, the exact 
average estimation error of the BLUE estimator has been 
derived and an accurate approximation for the LMMSE 
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Appendix B 


-m=3, n=5 
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A 


Figure 6. The estimation loss as a function of A: Theoretical 
expression in 01- 


average estimation error was proposed in both high and 
low SNR regimes. Additionally, we have shown that our 
results can be used to evaluate the accuracy of covariance 
matrix estimates. 


Appendix A 

Proof of Proposition 1 
As stated in section II, M . 2 (s) is given by: 


Proof of Proposition 2 


The handling of M\ (s — r + 1) is delicate because it 
involves evaluation of the Gamma function at negative 
integers. Hopefully, a compensation effect occurs due to 
the multiplicative term in front of the Gamma function. 
The proof relies on a divide and conquer strategy that 
consists in decomposing (s — r + 1) into a sum of 
terms and then evaluating each term separately. To this 
end, we need to introduce the following notations. 


V* 


A 


gs ^1+s . . . gn—m+s—1 


gs al+s . . . gn—m+s—1 

"n—m u n—m ' ' ' u n—m 


a 


_A 

8,3 ~ 


nn—m+s—r+j—1 nn—m+s—r+j —1 

“l ) "2 ’ 


t 


• ••, e 


n—m+s—r+j —1 
n—m 


u . A [as /)l+s ... gn—m+s— 11 * 

— [_ 0 n—m+ii u n—m+i'' ' ' i u n—m+i J 

l A [1 a gn—m —11 * 

— L A > °n-m+i, , c n-m+i J 

e k — [0, • • • ,0,1, zeros (A:)]*, k = 0, • • • ,n — m — 1. 

(33) 


M 2 (s) = l J 2 Y v ^^^ + j~ l ) 

j=r +1 i =1 


qn—m+s+j- 


n—m n—m 


\i e l 


,n—m+s+j—‘2nk—l 


n-m-\-i I * 


m m 


= L Y Y v ( i ’j) r (- r+ j) 

j=r+l i=1 


Using the previously defined varaibles, we can rewrite 
~ 2 M. \ (s — r + 1) as in 1341 ) (on top of the next page). 
The first term in equation (l34l ) is equal to zero. 
This can be seen by noticing that 'P s e r _j = a a j and 


b( ,e r _ , = r+J . Thus, 'P s 'a. SJ = e r _j and 


1=1 k= 1 / 

The handling of M. 2 {s ) does not pose any difficulty, 
the Gamma function being applied to non-negative ar¬ 
guments. Interestingly, we can show that this term turns 
out to be equal to zero as s goes to zero. To this end, 
notice that: 

lim M 2 (s — r + 1) 

s->-0 


_ ^n—m+s—r+j —1 

J s,i^ r —J ®n—m+i 

consequently b^'R- , a. SJ = 

It remains thus to deal with the last two terms. Using 
a Taylor approximation of b v ; as s approaching 0, we 
have 


b s ,j - b i = s 


log (0 n —m+i) , 0n—m+i log m+/) , 


’ ®n-m+i 1°S ( 0n-m+i ) 


+ o(s) 


/ n—m n—m > 

x taT H -E E [ rl ] tl ®r ra ' r+J_1 tL +i 


= S log (0 n —m+i) bi + O (s) . 


1=1 k=l 


(35) 


m m 

= i E EPIuPliJ-r 

j=r+1 i =1 
m 

= L Y > y +,., 

i—r+1 

where T> and C are as dehned in Uemma Q] Since V 
is the cofactor of C, then V t C = det (C) I m . Therefore, 

[ p E jd - r = 0 fo r j = r + 1 , • • • , m. 


To deal with the Gamma function evaluated at non 
positive integers, we rely on the result of the following 
lemma. 


Lemma 3. / 1 7 71/ For non positive arguments —k,k = 
0,1, 2, • • •, the Gamma function can be evaluated as 


lim Flffl 
s—>0 r ( s ) 


(-+ 


k\ 


(36) 
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M, (, - r + o = L E E V ( j .i) r (» - >• + i) CST 4 ^ 1 

(-1 i—1 \ 


,n-m+s—r+j-1 _ r^jy-1] an—m+s—r+j—lnk— 1 

n—m+i / / / / \k,l u l U n—m+i 


1=1 k =1 


j=l t=l 


£ E E P (O) r (» - r + j) [ - b!*-'a.j 


i=l i=l 

x b, ( tPr 1 - VP' 1 ) a. 


+iEE D <‘ ) i) r (s — r + j) 


j =1 i=l 


*EE®<‘ ,j)r(s-r + j) 0; 


i=l*=l 


in-m+s-r+j-l _ , f qy-1, 
7 n—m.+j u s,i ^ s 1 


'•'/iv •' +iEE D < i ,j)T(s-r + j) 


3 = 1 * =1 


Ki-b‘) t^+iEE'S ,j)r(s - r + j)b- (^ s 1 - *)a sJ 


j =1 *=1 


where T (s) = ^ + o(s) ns s approaches 0. 

Thus, T (s — r + j) = -m + o(s). Therefore , as 
v J ' s ^0 S ( r -J) ! w 

s approaches 0, we have 

T (s - r + j) (bE - b*) ^7 1 a s j 

= (- 1 ) r ~ Jlog (^-m+i) b ^-i + o(s) 

{r-j)\ 

Finally, to deal with the last term, we use the following 
resolvent identity : 

^r - 1 _ if 1 = vf — 1 (^r _ if? s ) vp — 1 

We also make use of the fact that as s approaches 0: 


This expression can be further simplified by noticing that 
\P\P 1 = <P. Finally, we have 

lim M\ (s — r + 1) 

s —>0 

r m ( . r -j V 

_ T \ ' \ 'T'l { ■ \ V V ul.T.-1 1_ I’D \ T 


= i EE V (*>•?) ( _ V b ^ 1 lt)g ( e n-m+i) I n-r 

3 =1 i= 1 lr J) ' 

- $1 a, 


r m y-j 

i EE z ’( < 'j)^)f b S*“ lD “>i 


3 = 1*=i 


where 


-’SO = - S \P + o(s) 

s->0 


\P = cpip 


with = diag(log(0i),log(0 2 ),--- ,log(0 n _ m )). 

Thus, as s approaches 0, we have 

T(s-r+j) b* (vf; 1 - tf 1 ) a Sij 
(-l) r+1 -3 

= a j+°( S )' 

s—ro (r - j)\ 

Finally, we have the following limit 


thereby ending up the proof of the proposition. 


Appendix C 
Proof of Theorem 2 

We start the proof by noticing that m (z) satisfies: 

n 

. . n 1 v-^ 1 

— zm(z) H-> --— -— = 1. 

m m ^ 1 + [D\ kk m(z) 

Let f k (z) = — 1+ [ D ]* m( z ) ' Then ' the above e q uation 


becomes: 


lim M\ (s — r + 1) 

s —S'-O 

r m 

3=1 *=1 


(~l) r ~' 7 log ifin-m+i) 


(■ r-j )! 


+ y , ... bjqrtvpvp'a . 

(r-j)l J 


-zm (z) + — + — f k (z) = 1. 
m m 

k =1 

Taking the p — 1 derivative of the above equation and set 
2 = 0, we have: 

1 n 

pm (p_1) (0) = -V f[ p) (0). (37) 



11 


On the other hand, functions /). (z) satisfy: 

-f k {z)-\D] kk f k {z)m{z) = l. (38) 
Taking the p-th derivative of equation (l38l) . we get: 


1=0 


P l )ft V>. 


or equivalently, 


/fc £?Wl + IP]*, fc 2& 


(o) 


= 0 


(39) 


Hence, by separating the first term of the sum in 
we obtain 


ft r1 + 


[D 


k.k 


m 


ip) f(°) 


1 + [°]fc,fc m (0) 


^ ^ [D] fc , fc m (0 / fc (p 0 = Q 

i=i 


l ) 1 + [D] fc ,m(0) 


Combining (l37l) and (l39l ) and taking the sum over k of 
the above equation, we get: 


prrS p ^ H- 


m {p) ^ L^J k,kJk 


E 




( 0 ) 


m 1 + [ D ]fc,fc"f(0) 


n p— 1 




p 


m *■—' z —' v l J 1 + [D 

k= 1 1=1 x ' 


[D } k , k in {l) fl? 0 


Al,/c 


m (0) 


= 0, 


(40) 


where m(0) is the unique solution to the fixed point 
equation: 


771 (0) = 


1 


m Z-/k=l 


1D1 


(41) 


l+[D] fc , fc m(0) 

This ends up the proof of the Theorem. 


Appendix D 
Proof of Theorem 4 


Since ^ < 1, then by Tayor expansion around 0, 


the trace of X 

e,lmmse 

can be expressed as: 


Tl* (^e,lmmse) 

00 / 

= E- 

“Lin 

af 1 

(s-*- 1 ), 

(42) 


k= 0 





=e ! 

k= 0 

-9‘rri 

a 2k 1 

^ X 

(S + o( 

f 1 5 

\°x) 





(43) 


where l < p — 1 is the order at which we truncate 
the Taylor expansion. Note that we impose the con¬ 
dition r < p — 1 since the moments (—k — 1) 
are only defined for 1 < k < p— 1. Upon applying 
the expectation, we get: 

1 (—l) fc 

||x; mmse -^-11 ^ ' 2k MA ( k 1) 

k =0 ^ 

+ ° ( ct .t 2Z ) 

(44) 


2) Low SNR regime (<r^ <C 1): 

We proceed similarly as the high SNR regime. For 
that, we make some manipulations on E e / mmse as 
follows: 


J e,lmmse 


= o In + S 


at 


= al(l n + a 2 x S) 1 
= a 2 x (a 2 I n + S- 1 )- 1 S" 1 


Using the same approach for proving Theorem [3] 
1), we get 


{11 ^Immse ~ x|| 2 } = 777 ^ (-1)* CT 2fc+2 // A ( k) 
k =0 

(45) 

As seen in the previous equation, we retain all 
the positive moments since they exist and can be 
computed by means of Theorem 1. 

This completes the proof of Theorem [3] 


The proof of this theorem is based on a Taylor 
approximation of the LMMSE average error. 

1) High SNR regime ( a 2 2> 1): 

As stated in equation (l22l) 


J e,lmmse 


= ( -^I n + H*E7 1 H 


at 


-l 


By setting XL 1 = A, we have 


^ejmmse — 9 In T S 


-1 
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